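QNNVerifier is the first open-source tool for verifying implementations of neural networks that takes into account the finite word length (i.e., quantization) of their operands. Its novel support for quantization is achieved by employing state-of-the-art software model checking (SMC) techniques. It translates the implementation of a neural network into a decidable fragment of first-order logic based on satisfiability modulo theories (SMT). The effects of fixed- and floating-point operations are represented through direct implementations, given a hardware-determined precision. Furthermore, QNNVerifier allows users to specify bespoke safety properties and to verify the resulting model with different verification strategies (incremental and k-induction) and SMT solvers. Finally, QNNVerifier is the first tool that combines invariant inference via interval analysis with discretization of non-linear activation functions to speed up the verification of neural networks by orders of magnitude. A video presentation of QNNVerifier is available at https://youtu.be/7jmgol41zty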
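To make the idea of interval analysis concrete, the following is a minimal sketch of propagating input intervals through one dense layer followed by ReLU. It is not QNNVerifier's implementation: the function names `affine_bounds` and `relu_bounds` are hypothetical, the arithmetic is plain Python floats, and rounding or quantization effects are deliberately ignored.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def affine_bounds(
    inputs: List[Interval],
    weights: List[List[float]],
    biases: List[float],
) -> List[Interval]:
    """Bound each neuron's pre-activation w.x + b over the input box.

    Hypothetical helper: one row of `weights` per output neuron.
    """
    bounds = []
    for row, bias in zip(weights, biases):
        lo = hi = bias
        for (x_lo, x_hi), w in zip(inputs, row):
            if w >= 0:
                # A non-negative weight maps lower to lower, upper to upper.
                lo += w * x_lo
                hi += w * x_hi
            else:
                # A negative weight swaps the roles of the two endpoints.
                lo += w * x_hi
                hi += w * x_lo
        bounds.append((lo, hi))
    return bounds


def relu_bounds(intervals: List[Interval]) -> List[Interval]:
    """ReLU is monotone, so it can be applied to each endpoint separately."""
    return [(max(0.0, lo), max(0.0, hi)) for lo, hi in intervals]


if __name__ == "__main__":
    box = [(0.0, 1.0), (-1.0, 1.0)]
    pre = affine_bounds(box, [[1.0, -2.0], [-1.0, 0.5]], [0.0, -1.0])
    print(pre, relu_bounds(pre))
```

Bounds of this kind can serve as invariants: a neuron whose upper bound is zero after ReLU is provably inactive on the whole input box, which shrinks the problem handed to the solver.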
Translated by Google Translate
Traditionally, data analysis and theory have been viewed as separate disciplines, each feeding into fundamentally different types of models. Modern deep learning technology is beginning to unify these two disciplines and will produce a new class of predictively powerful space weather models that combine the physical insights gained from data and theory. We call on NASA to invest in the research and infrastructure necessary for the heliophysics community to take advantage of these advances.
Training agents via off-policy deep reinforcement learning (RL) requires a large memory, named replay memory, that stores past experiences used for learning. These experiences are sampled, uniformly or non-uniformly, to create the batches used for training. When calculating the loss function, off-policy algorithms assume that all samples are of the same importance. In this paper, we hypothesize that training can be enhanced by assigning a different importance to each experience, based on its temporal-difference (TD) error, directly in the training objective. We propose a novel method that introduces a weighting factor for each experience when calculating the loss function at the learning stage. In addition to improving convergence speed when used with uniform sampling, the method can be combined with prioritization methods for non-uniform sampling. Combining the proposed method with prioritization methods improves sampling efficiency while increasing the performance of TD-based off-policy RL algorithms. The effectiveness of the proposed method is demonstrated by experiments in six environments of the OpenAI Gym suite. The experimental results show that the proposed method achieves a 33%~76% reduction in convergence time in three environments, and an 11% increase in returns and a 3%~10% increase in success rate in the other three environments.
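The idea of weighting each experience inside the loss can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: the abstract does not give the weighting function, so the form `|delta|^alpha` normalized to mean 1, and the names `td_errors`, `importance_weights`, and `weighted_td_loss`, are hypothetical.

```python
from typing import List, Sequence


def td_errors(
    rewards: Sequence[float],
    q_values: Sequence[float],
    next_q_values: Sequence[float],
    dones: Sequence[bool],
    gamma: float = 0.99,
) -> List[float]:
    """One-step TD error per sample: target - Q(s, a).

    `next_q_values` is assumed to hold max_a' Q(s', a') for each sample.
    """
    errors = []
    for r, q, next_q, done in zip(rewards, q_values, next_q_values, dones):
        target = r if done else r + gamma * next_q
        errors.append(target - q)
    return errors


def importance_weights(
    errors: Sequence[float], alpha: float = 0.5, eps: float = 1e-6
) -> List[float]:
    """Hypothetical weighting: |delta|^alpha, rescaled so the mean is 1."""
    raw = [(abs(e) + eps) ** alpha for e in errors]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


def weighted_td_loss(errors: Sequence[float], weights: Sequence[float]) -> float:
    """Mean of weight * squared TD error; uniform weights give plain MSE."""
    return sum(w * e * e for w, e in zip(weights, errors)) / len(errors)


if __name__ == "__main__":
    deltas = td_errors([1.0, 0.0], [0.5, 0.5], [1.0, 1.0], [True, False], gamma=0.9)
    print(weighted_td_loss(deltas, importance_weights(deltas)))
```

In an actual training loop the weights would be treated as constants (no gradient flows through them), and they could be multiplied with the importance-sampling corrections of a prioritized sampler.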
Bi-encoders and cross-encoders are widely used in many state-of-the-art retrieval pipelines. In this work we study the generalization ability of these two types of architectures across a wide range of parameter counts, in both in-domain and out-of-domain scenarios. We find that the number of parameters and the early query-document interactions of cross-encoders play a significant role in the generalization ability of retrieval models. Our experiments show that increasing model size results in marginal gains on in-domain test sets, but much larger gains in new domains never seen during fine-tuning. Furthermore, we show that cross-encoders largely outperform bi-encoders of similar size in several tasks. In the BEIR benchmark, our largest cross-encoder surpasses a state-of-the-art bi-encoder by more than 4 average points. Finally, we show that using bi-encoders as first-stage retrievers provides no gains in comparison to a simpler retriever such as BM25 on out-of-domain tasks. The code is available at https://github.com/guilhermemr04/scaling-zero-shot-retrieval.git
The usage of deep neural networks in safety-critical systems is limited by our ability to guarantee their correct behavior. Runtime monitors are components aiming to identify unsafe predictions and discard them before they can lead to catastrophic consequences. Several recent works on runtime monitoring have focused on out-of-distribution (OOD) detection, i.e., identifying inputs that are different from the training data. In this work, we argue that OOD detection is not a well-suited framework for designing efficient runtime monitors and that it is more relevant to evaluate monitors based on their ability to discard incorrect predictions. We call this setting out-of-model-scope detection and discuss the conceptual differences with OOD. We also conduct extensive experiments on popular datasets from the literature to show that studying monitors in the OOD setting can be misleading: (1) very good OOD results can give a false impression of safety, and (2) comparison under the OOD setting does not allow identifying the best monitor to detect errors. Finally, we also show that removing erroneous training data samples helps to train better monitors.
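The difference between the two evaluation settings can be illustrated with a small sketch. The `Sample` record and the function names below are hypothetical, and the rejection rate on positives is used as a stand-in for whatever metrics the paper reports; the only point is that the same monitor decisions are scored against two different label sets.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    is_ood: bool               # input differs from the training distribution
    prediction_correct: bool   # the model's prediction matches the ground truth
    rejected: bool             # the monitor discarded this prediction


def detection_rate(samples: List[Sample], positive: Callable[[Sample], bool]) -> float:
    """Fraction of 'positive' samples that the monitor rejected."""
    positives = [s for s in samples if positive(s)]
    if not positives:
        return float("nan")
    return sum(s.rejected for s in positives) / len(positives)


def ood_recall(samples: List[Sample]) -> float:
    """OOD setting: the monitor should reject out-of-distribution inputs."""
    return detection_rate(samples, lambda s: s.is_ood)


def oms_recall(samples: List[Sample]) -> float:
    """Out-of-model-scope setting: the monitor should reject wrong predictions."""
    return detection_rate(samples, lambda s: not s.prediction_correct)


if __name__ == "__main__":
    data = [
        Sample(is_ood=True, prediction_correct=True, rejected=True),
        Sample(is_ood=True, prediction_correct=False, rejected=True),
        Sample(is_ood=False, prediction_correct=False, rejected=False),
        Sample(is_ood=False, prediction_correct=True, rejected=False),
    ]
    print(ood_recall(data), oms_recall(data))
```

In this toy data the monitor catches every OOD input yet misses an in-distribution error, which is the kind of false impression of safety the abstract warns about.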
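With the increasing use of machine learning (ML) in critical autonomous systems, runtime monitors have been developed to detect prediction errors and keep the system in a safe state during operation. Monitors have been proposed for different applications involving diverse perception tasks and ML models, and specific evaluation procedures and metrics are used in different contexts. This paper introduces three unified safety-oriented metrics, representing the safety benefit of the monitor (Safety Gain), the safety gap that remains after using it (Residual Hazard), and its negative impact on system performance (Availability Cost). Computing these metrics requires defining two return functions, which represent how a given ML prediction affects expected future rewards and hazards. Three use cases (classification, drone landing, and autonomous driving) are used to demonstrate how metrics from the literature can be expressed in terms of the proposed metrics. Experimental results on these examples show how different evaluation choices affect the perceived performance of a monitor. Because our formalism requires us to formulate explicit safety assumptions, it allows us to ensure that the evaluation conducted matches the high-level system requirements.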
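One possible reading of the three metrics is sketched below. The abstract does not state the formulas, so everything here is an assumption: the `Outcome` record, the choice of averaging over samples, and in particular the fallback model in which a rejected prediction yields zero reward and zero hazard.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Outcome:
    reward: float    # return if the ML prediction is acted upon (assumed given)
    hazard: float    # hazard if the ML prediction is acted upon (assumed given)
    rejected: bool   # the monitor's decision for this prediction


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def safety_metrics(outcomes: List[Outcome]) -> Tuple[float, float, float]:
    """Return (safety_gain, residual_hazard, availability_cost).

    Assumption: when the monitor rejects a prediction, a fallback takes over
    that produces neither reward nor hazard.
    """
    hazard_without = mean([o.hazard for o in outcomes])
    reward_without = mean([o.reward for o in outcomes])
    hazard_with = mean([0.0 if o.rejected else o.hazard for o in outcomes])
    reward_with = mean([0.0 if o.rejected else o.reward for o in outcomes])

    safety_gain = hazard_without - hazard_with        # hazard removed by the monitor
    residual_hazard = hazard_with                     # hazard that still gets through
    availability_cost = reward_without - reward_with  # reward lost to rejections
    return safety_gain, residual_hazard, availability_cost


if __name__ == "__main__":
    log = [
        Outcome(reward=1.0, hazard=0.0, rejected=False),
        Outcome(reward=0.0, hazard=1.0, rejected=True),
        Outcome(reward=0.0, hazard=2.0, rejected=False),
        Outcome(reward=1.0, hazard=0.0, rejected=True),
    ]
    print(safety_metrics(log))
```

Changing the fallback assumption (for example, a safe stop that still incurs some hazard) changes all three numbers, which is why the explicit safety assumptions matter for the evaluation.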
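We introduce MR-NET, a general architecture for multiresolution neural networks, together with a framework for imaging applications based on this architecture. Our coordinate-based networks are continuous in both space and scale, as they are composed of multiple stages that progressively add finer details. In addition, they are a compact and efficient representation. We show examples of multiresolution image representation and applications to texture magnification and minification, and to antialiasing.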
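Rainforests play an important role in the global ecosystem. However, significant regions of them are facing deforestation and degradation for several reasons. Various government and private initiatives have been created to monitor and alert on increases in deforestation from remote sensing images, using different ways to deal with the notable amount of generated data. Citizen science projects can also be used to reach the same goal. Citizen science consists of scientific research that involves non-professional volunteers in analyzing and collecting data and in contributing their computational resources, in order to advance science and to increase the public's understanding of problems in specific knowledge areas such as astronomy, chemistry, mathematics, and physics. In this sense, this work presents a citizen science project called Foresteyes, which uses volunteers' answers, obtained through the analysis and classification of remote sensing images, to monitor deforestation regions in rainforests. To evaluate the quality of these answers, different campaigns/workflows were launched using remote sensing images from the Brazilian Legal Amazon, and their results were compared with the official ground truth produced by the Amazon deforestation monitoring project. In this work, the first two workflows, which cover the state of Rondônia in the years 2013 and 2016, received more than 35,000 answers from 383 volunteers in the 2,050 created tasks in only two and a half weeks after their launch. The other four workflows, which cover the same area (Rondônia) with different setups (e.g., image segmentation method, image resolution, and detection target), received 51,035 answers from 281 volunteers in 3,358 tasks. In the performed experiments...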
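The Solar Dynamics Observatory (SDO), a decade-long multi-spectral NASA mission that produces terabytes of observational data from the Sun every day, has been used to demonstrate the potential of machine learning methodologies and to pave the way for future deep-space mission planning. In particular, the idea of using image-to-image translation to virtually produce extreme ultraviolet channels has been proposed in several recent studies, as a way both to enhance missions with fewer available channels and to alleviate the challenges caused by the low downlink rate in deep space. This paper investigates the potential and the limitations of such a deep learning approach by focusing on the permutation of four channels and an encoder-based architecture, with particular attention to how the morphological traits and brightness of the solar surface affect the neural network predictions. In this work we want to answer the question: can synthetic images of the solar corona produced via image-to-image translation be used for scientific studies of the Sun? The analysis highlights that the neural network produces high-quality images in terms of count rate (pixel intensity) and can generally reproduce the covariance across channels within a 1% error. However, the model performance drops drastically for extremely high-energy events such as flares; we argue that the reason is the rarity of such events, which poses a challenge for model training.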
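Iteratively creating pixel art character sprite sheets is essential to the game development process. However, it can take a lot of effort until the final versions, containing different poses and animation clips, are completed. This paper investigates the use of conditional generative adversarial networks to help designers create such sprite sheets. We propose an architecture based on Pix2Pix to generate images of characters facing a target side (e.g., right) given sprites of them in a source pose (e.g., front). Experiments with small pixel art datasets yielded promising results, producing models with varying degrees of generalization that are sometimes capable of generating images very close to the ground truth. We analyze the results through visual inspection and quantitatively with FID.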